AITopics | backward compatibility

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

On the Shelf Life of Fine-Tuned LLM Judges: Future Proofing, Backward Compatibility, and Question Generalization

Singh, Janvijay, Xu, Austin, Zhou, Yilun, Zhou, Yefan, Hakkani-Tur, Dilek, Joty, Shafiq

arXiv.org Artificial Intelligence, Sep-30-2025

The LLM-as-a-judge paradigm is widely used both for evaluating free-text model responses and for reward modeling in model alignment and finetuning. Recently, finetuning judges with judge-specific data has emerged as an often preferred choice over directly prompting frontier models as judges, as the former achieves better performance with smaller model sizes while being more robust to common biases. However, the standard evaluation ignores several practical concerns of finetuned judges regarding their real-world deployment. In this paper, we identify and formalize three aspects that affect the shelf life of these judges: future proofing and backward compatibility (how well judges finetuned on responses by today's generator models perform on responses by future or past models), and question generalization (how well judges generalize to unseen questions at test time). We study these three aspects in the math domain under a unified framework with varying train and test distributions, three SFT- and DPO-based finetuning algorithms, and three different base models. Experiments suggest that future proofing is challenging for most models, while backward compatibility is relatively easy, with DPO-trained models consistently improving performance. We further find that continual learning provides a more balanced adaptation to shifts between older and newer response distributions than training solely on stronger or weaker responses. Moreover, all models show some degree of performance degradation when moving from questions seen during training to unseen ones, showing that current judges do not fully generalize to unseen questions. These findings provide insights into practical considerations for developing and deploying judge models in the face of ever-changing generators. Automatic evaluators have become a central part of the large language model (LLM) development cycle.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2509.23542

Country:

North America > United States (0.46)
North America > Mexico (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Backward Compatibility in Attributive Explanation and Enhanced Model Training Method

Matsuno, Ryuta

arXiv.org Artificial Intelligence, Aug-5-2024

Model update is a crucial process in the operation of ML/AI systems. While updating a model generally enhances the average prediction performance, it also significantly impacts the explanations of predictions. In real-world applications, even minor changes in explanations can have detrimental consequences. To tackle this issue, this paper introduces BCX, a quantitative metric that evaluates the backward compatibility of feature attribution explanations between pre- and post-update models. BCX utilizes practical agreement metrics to calculate the average agreement between the explanations of pre- and post-update models, specifically among samples that both models predict correctly. In addition, we propose BCXR, a BCX-aware model training method that designs surrogate losses which theoretically lower-bound the agreement scores. Furthermore, we present a universal variant of BCXR that improves all agreement metrics, utilizing the L2 distance between the explanations of the models. To validate our approach, we conducted experiments on eight real-world datasets, demonstrating that BCXR achieves superior trade-offs between predictive performance and BCX scores.
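The BCX idea described in the abstract (average explanation agreement over samples that both models predict correctly) can be illustrated with a minimal sketch. The abstract does not say which agreement metrics the paper uses, so the top-k feature overlap below is an assumed stand-in, and the names `top_k_features`, `top_k_agreement`, and `bcx_score` are hypothetical rather than the paper's API.

```python
from typing import List, Sequence, Set


def top_k_features(attributions: Sequence[float], k: int) -> Set[int]:
    """Indices of the k features with the largest absolute attribution."""
    ranked = sorted(range(len(attributions)),
                    key=lambda i: abs(attributions[i]), reverse=True)
    return set(ranked[:k])


def top_k_agreement(old_attr: Sequence[float], new_attr: Sequence[float], k: int) -> float:
    """Fraction of top-k features shared by two explanations (assumed metric)."""
    shared = top_k_features(old_attr, k) & top_k_features(new_attr, k)
    return len(shared) / k


def bcx_score(labels: Sequence[int],
              old_preds: Sequence[int],
              new_preds: Sequence[int],
              old_attrs: Sequence[Sequence[float]],
              new_attrs: Sequence[Sequence[float]],
              k: int) -> float:
    """Average agreement over samples that BOTH models classify correctly."""
    scores: List[float] = []
    for y, po, pn, ao, an in zip(labels, old_preds, new_preds, old_attrs, new_attrs):
        if po == y and pn == y:  # skip samples either model gets wrong
            scores.append(top_k_agreement(ao, an, k))
    if not scores:
        raise ValueError("no sample is predicted correctly by both models")
    return sum(scores) / len(scores)


# Toy data: three samples, three features each.
labels = [1, 0, 1]
old_preds = [1, 0, 0]   # old model is wrong on the last sample
new_preds = [1, 0, 1]
old_attrs = [[0.9, 0.1, -0.5], [0.1, 0.2, 0.3], [0.3, 0.3, 0.3]]
new_attrs = [[0.8, -0.6, 0.05], [0.0, 0.5, 0.4], [0.1, 0.9, 0.2]]
score = bcx_score(labels, old_preds, new_preds, old_attrs, new_attrs, k=2)
```

Restricting the average to jointly correct samples keeps the score from penalizing explanation changes that merely accompany a corrected prediction.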

agreement metric, explanation, normdisagree, (12 more...)

arXiv.org Artificial Intelligence

2408.02298

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Towards Cross-modal Backward-compatible Representation Learning for Vision-Language Models

Jang, Young Kyun, Lim, Ser-nam

arXiv.org Artificial Intelligence, May-23-2024

Modern retrieval systems often struggle with upgrading to new and more powerful models due to the incompatibility of embeddings between the old and new models. This necessitates a costly process known as backfilling, which involves re-computing the embeddings for a large number of data samples. In vision, Backward-compatible Training (BT) has been proposed to ensure that the new model aligns with the old model's embeddings. This paper extends the concept of vision-only BT to the field of cross-modal retrieval, marking the first attempt to address Cross-modal BT (XBT). Our goal is to achieve backward-compatibility between Vision-Language Pretraining (VLP) models, such as CLIP, for the cross-modal retrieval task. To address XBT challenges, we propose an efficient solution: a projection module that maps the new model's embeddings to those of the old model. This module, pretrained solely with text data, significantly reduces the number of image-text pairs required for XBT learning, and, once it is pretrained, it avoids using the old model during training. Furthermore, we utilize parameter-efficient training strategies that improve efficiency and preserve the off-the-shelf new model's knowledge by avoiding any modifications. Experimental results on cross-modal retrieval datasets demonstrate the effectiveness of XBT and its potential to enable backfill-free upgrades when a new VLP model emerges.

compatibility, new model, vlp model, (17 more...)

arXiv.org Artificial Intelligence

2405.14715

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

GitHub - GPflow/GPflow: Gaussian processes in TensorFlow

#artificialintelligence, Apr-1-2023, 00:45:51 GMT

GPflow is a package for building Gaussian process models in Python. It implements modern Gaussian process inference for composable kernels and likelihoods. GPflow builds on TensorFlow 2.4 and TensorFlow Probability for running computations, which allows fast execution on GPUs. The online documentation (latest release and develop versions) contains more details. It was originally created by James Hensman and Alexander G. de G. Matthews.

gpflow, gpflow 2, tensorflow, (16 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Modeling & Simulation (0.83)

How I Refactored a Monolithic Code Base Into an Add-In Architecture

#artificialintelligence, Feb-19-2023, 11:00:32 GMT

Before my first professional job, I would listen to developers talk about what it was like to work on someone else's messy code that consisted of anti-patterns. They would tell horror stories. Then, I took my second assignment as a fresh Dotnet developer, and that horror was exactly what I had been scared of. My new job was to integrate engineering rule sets into an engineering application. The application was already developed and running with a library with three rule sets.

application, assembly, dsl, (11 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.68)

Why I switched from console gaming to PC gaming

PCWorld, Feb-15-2023, 11:30:00 GMT

I've been a gamer pretty much all of my life. When I was a kid, my dad taught me how to navigate the puzzles and defeat the bosses in Legend of Zelda: Link to the Past on the original Nintendo. Those are some of my earliest childhood memories, which is why console gaming will always have a special place in my heart. That said, now that I'm in my mid-thirties and value comfort and convenience above all else, I've mostly switched to PC gaming these days. I'm not going to lie, my argument for switching to PC gaming is mostly rooted in my persnickety personality.

console gaming, gaming, pc gaming, (4 more...)

PCWorld

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology: Information Technology > Artificial Intelligence > Games > Computer Games (0.32)

Improving Prediction Backward-Compatiblility in NLP Model Upgrade with Gated Fusion

Lai, Yi-An, Mansimov, Elman, Xie, Yuqing, Zhang, Yi

arXiv.org Artificial Intelligence, Feb-3-2023

When upgrading neural models to a newer version, new errors that were not encountered in the legacy version can be introduced, known as regression errors. This inconsistent behavior during model upgrade often outweighs the benefits of accuracy gain and hinders the adoption of new models. To mitigate regression errors from model upgrade, distillation and ensembling have proven to be viable solutions without significant compromise in performance. Despite this progress, these approaches achieve only an incremental reduction in regression, which is still far from a backward-compatible model upgrade. In this work, we propose a novel method, Gated Fusion, that promotes backward compatibility by learning to mix predictions between old and new models. Empirical results on two distinct model upgrade scenarios show that our method reduces the number of regression errors by 62% on average, outperforming the strongest baseline by an average of 25%.
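To make "mixing predictions between old and new models" concrete, here is a minimal sketch of a gated mixture of two models' class probabilities. It is an illustration under assumptions, not the paper's implementation: the abstract says the mixing is learned, whereas here the gate is simply passed in as a number, and the function names `softmax` and `fuse_predictions` are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(old_logits: Sequence[float],
                     new_logits: Sequence[float],
                     gate: float) -> List[float]:
    """Mix class probabilities: gate * new + (1 - gate) * old.

    `gate` is assumed to lie in [0, 1]; in a learned setup it would be
    produced per input by a small trainable module instead of being fixed.
    """
    if not 0.0 <= gate <= 1.0:
        raise ValueError("gate must be in [0, 1]")
    p_old = softmax(old_logits)
    p_new = softmax(new_logits)
    return [gate * pn + (1.0 - gate) * po for po, pn in zip(p_old, p_new)]


# The old model favors class 0, the new model favors class 1.
old_logits = [2.0, 0.0]
new_logits = [0.0, 1.0]
mostly_old = fuse_predictions(old_logits, new_logits, gate=0.1)
mostly_new = fuse_predictions(old_logits, new_logits, gate=0.9)
```

A gate near 0 reproduces the old model's behavior on inputs where a regression is likely, while a gate near 1 defers to the new model.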

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2302.0208

Country:

Europe (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Industry: Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Backward Compatibility During Data Updates by Weight Interpolation

Schumann, Raphael, Mansimov, Elman, Lai, Yi-An, Pappas, Nikolaos, Gao, Xibin, Zhang, Yi

arXiv.org Artificial Intelligence, Jan-25-2023

Backward compatibility of model predictions is a desired property when updating a machine-learning-driven application. It allows the underlying model to be improved seamlessly without introducing regression bugs. In classification tasks, these bugs occur in the form of negative flips: an instance that was correctly classified by the old model is classified incorrectly by the updated model. This has a direct negative impact on the user experience of such systems, e.g., a frequently used voice assistant query is suddenly misclassified. A common reason to update the model is that new training data becomes available and needs to be incorporated. Simply retraining the model with the updated data introduces the unwanted negative flips. We study the problem of regression during data updates and propose Backward Compatible Weight Interpolation (BCWI). This method interpolates between the weights of the old and new model, and we show in extensive experiments that it reduces negative flips without sacrificing the improved accuracy of the new model. BCWI is straightforward to implement and does not increase inference cost. We also explore the use of importance weighting during interpolation and averaging the weights of multiple new models in order to further reduce negative flips.
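The two notions in this abstract, negative flips and interpolation between old and new weights, can be sketched in a few lines. This is a rough illustration under assumptions: weights are plain lists of floats keyed by parameter name, the coefficient `alpha` and its direction (0 gives the old model, 1 the new one) are chosen here for illustration, and the rate is normalized by the total number of instances; the paper's exact conventions may differ.

```python
from typing import Dict, List, Sequence

Weights = Dict[str, List[float]]


def negative_flip_rate(labels: Sequence[int],
                       old_preds: Sequence[int],
                       new_preds: Sequence[int]) -> float:
    """Share of instances the old model got right and the new model gets wrong."""
    flips = sum(1 for y, o, n in zip(labels, old_preds, new_preds)
                if o == y and n != y)
    return flips / len(labels)


def interpolate_weights(old: Weights, new: Weights, alpha: float) -> Weights:
    """Element-wise (1 - alpha) * old + alpha * new for every parameter."""
    if old.keys() != new.keys():
        raise ValueError("models must share the same parameter names")
    return {
        name: [(1.0 - alpha) * o + alpha * n
               for o, n in zip(old[name], new[name])]
        for name in old
    }


# Toy example: one negative flip (second instance) out of four.
labels = [0, 1, 1, 0]
old_preds = [0, 1, 0, 0]
new_preds = [0, 0, 1, 0]
nfr = negative_flip_rate(labels, old_preds, new_preds)

# Halfway between two toy weight vectors.
old_w = {"w": [0.0, 2.0]}
new_w = {"w": [1.0, 4.0]}
mid_w = interpolate_weights(old_w, new_w, alpha=0.5)
```

Because the interpolated model has the same shape as either endpoint, serving it costs the same as serving a single model, which matches the abstract's remark that inference cost does not increase.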

machine learning, natural language, new model, (18 more...)

arXiv.org Artificial Intelligence

2301.10546

Country: North America > United States (0.28)

Genre: Research Report (0.83)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

On the Model of Computation: Counterpoint

Communications of the ACM, Aug-20-2022, 13:00:09 GMT

Andy Grove (Intel's business leader until 2004) coined the term "software spiral" (SWS) for the exceptionally resilient business model behind general-purpose CPUs. Application software is the defining component of SWS: code written once can still benefit from the performance scaling of later CPU generations. SWS comprises several abstraction levels. The random access machine (RAM) model is the most relevant for the current Counterpoint Viewpoint (CPV): each serial step of an algorithm features a basic operation taking unit time (the "uniform cost" criterion). The RAM has long been the gold standard for algorithms and data structures.

algorithm, parallelism, programmer, (16 more...)

Communications of the ACM

Country:

North America > United States > Wisconsin > Milwaukee County > Milwaukee (0.04)
North America > United States > Virginia > Alexandria County > Alexandria (0.04)
North America > United States > Maryland > Prince George's County > College Park (0.04)
Asia > Taiwan (0.04)

Industry:

Government (0.69)
Education > Educational Setting > K-12 Education (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

An Empirical Analysis of Backward Compatibility in Machine Learning Systems

Srivastava, Megha, Nushi, Besmira, Kamar, Ece, Shah, Shital, Horvitz, Eric

arXiv.org Machine Learning, Aug-11-2020

In many applications of machine learning (ML), updates are performed with the goal of enhancing model performance. However, current practices for updating models rely solely on isolated, aggregate performance analyses, overlooking important dependencies, expectations, and needs in real-world deployments. We consider how updates, intended to improve ML models, can introduce new errors that can significantly affect downstream systems and users. For example, updates in models used in cloud-based classification services, such as image recognition, can cause unexpected erroneous behavior in systems that make calls to the services. Prior work has shown the importance of "backward compatibility" for maintaining human trust. We study challenges with backward compatibility across different ML architectures and datasets, focusing on common settings including data shifts with structured noise and ML employed in inferential pipelines. Our results show that (i) compatibility issues arise even without data shift due to optimization stochasticity, (ii) training on large-scale noisy datasets often results in significant decreases in backward compatibility even when model accuracy increases, and (iii) distributions of incompatible points align with noise bias, motivating the need for compatibility aware de-noising and robustness methods.

accuracy, backward compatibility, noise, (13 more...)

arXiv.org Machine Learning

2008.04572

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.93)
Information Technology > Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.34)
